RNN PyTorch Source Code
Applies a multi-layer Elman RNN with $\tanh$ or $\text{ReLU}$ non-linearity to an input sequence.
For each element in the input sequence, each layer computes the following function:
$$h_t = \text{tanh}(W_{ih} x_t + b_{ih} + W_{hh} h_{(t-1)} + b_{hh}) $$
where $h_t$ is the hidden state at time $t$, $x_t$ is the input at time $t$, and $h_{(t-1)}$ is the hidden state of the previous layer at time $t-1$ or the initial hidden state at time $0$. If `nonlinearity` is `'relu'`, then $\text{ReLU}$ is used instead of $\tanh$.
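To make the recurrence concrete, here is a minimal sketch (not from the source) that replays the formula by hand using a single-layer module's own parameters (`weight_ih_l0`, `weight_hh_l0`, `bias_ih_l0`, `bias_hh_l0` are PyTorch's standard parameter names; the sizes are illustrative) and checks the result against the module's output:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=3, hidden_size=4, num_layers=1)  # illustrative sizes

x = torch.randn(5, 1, 3)   # (seq_len, batch, input_size)
h0 = torch.zeros(1, 1, 4)  # (num_layers * num_directions, batch, hidden_size)
output, h_n = rnn(x, h0)

# Replay h_t = tanh(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh) step by step.
h_t = h0[0]
for t in range(x.size(0)):
    h_t = torch.tanh(x[t] @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
                     + h_t @ rnn.weight_hh_l0.T + rnn.bias_hh_l0)

print(torch.allclose(h_t, output[-1]))  # True: the loop matches the module
```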
- Inputs: $input$, $h_0$
  - $input$ of shape (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable-length sequence; see `torch.nn.utils.rnn.pack_padded_sequence` or `torch.nn.utils.rnn.pack_sequence` for details (a packed-sequence sketch follows this list).
  - $h_0$ of shape (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch. Defaults to zero if not provided. If the RNN is bidirectional, num_directions should be 2, else it should be 1.
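A minimal sketch of the packed-sequence path mentioned above; the lengths and sizes here are illustrative assumptions:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2)

# Two sequences with true lengths 5 and 3, zero-padded to seq_len = 5.
padded = torch.randn(5, 2, 10)  # (seq_len, batch, input_size)
lengths = torch.tensor([5, 3])  # sorted descending (required unless enforce_sorted=False)

packed = pack_padded_sequence(padded, lengths)
packed_out, h_n = rnn(packed)   # packed_out is also a PackedSequence
output, out_lengths = pad_packed_sequence(packed_out)
print(output.shape)             # torch.Size([5, 2, 20])
```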
- Outputs: $output$, $h_n$
  - $output$ of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features ($h_t$) from the last layer of the RNN, for each $t$. If a `torch.nn.utils.rnn.PackedSequence` has been given as the input, the output will also be a packed sequence. For the unpacked case, the directions can be separated using `output.view(seq_len, batch, num_directions, hidden_size)`, with forward and backward being direction 0 and 1 respectively (see the sketch after this list). Similarly, the directions can be separated in the packed case.
  - $h_n$ of shape (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for $t = \text{seq\_len}$. Like $output$, the layers can be separated using `h_n.view(num_layers, num_directions, batch, hidden_size)`.
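A minimal sketch of separating directions and layers as described above, assuming an illustrative bidirectional two-layer RNN:

```python
import torch
import torch.nn as nn

seq_len, batch, input_size, hidden_size, num_layers = 5, 3, 10, 20, 2
rnn = nn.RNN(input_size, hidden_size, num_layers, bidirectional=True)

x = torch.randn(seq_len, batch, input_size)
output, h_n = rnn(x)  # output: (5, 3, 2 * 20), h_n: (2 * 2, 3, 20)

dirs = output.view(seq_len, batch, 2, hidden_size)
forward_out = dirs[:, :, 0, :]   # direction 0 (forward)
backward_out = dirs[:, :, 1, :]  # direction 1 (backward)

layers = h_n.view(num_layers, 2, batch, hidden_size)  # separate layers and directions
```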
Note
All the weights and biases are initialized from $\mathcal{U}(-\sqrt{k}, \sqrt{k})$, where $k = \frac{1}{\text{hidden\_size}}$.
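As a quick sanity check of this bound (a sketch with illustrative sizes), every parameter should fall inside $[-\sqrt{k}, \sqrt{k}]$:

```python
import math
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2)  # illustrative sizes
bound = math.sqrt(1.0 / rnn.hidden_size)  # sqrt(k), with k = 1 / hidden_size

for name, p in rnn.named_parameters():
    assert (p >= -bound).all() and (p <= bound).all(), name
```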
- batch_first: if True, then the input and output tensors are provided as (batch, seq, feature_size). Default: False.
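A minimal sketch of the batch_first layout, assuming illustrative sizes; note that $h_n$ keeps its (num_layers * num_directions, batch, hidden_size) shape even with batch_first=True:

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2, batch_first=True)

x = torch.randn(3, 5, 10)  # (batch, seq, feature_size)
output, h_n = rnn(x)
print(output.shape)        # torch.Size([3, 5, 20]): also batch first
print(h_n.shape)           # torch.Size([2, 3, 20]): h_n is not batch first
```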
Examples

```python
import torch
import torch.nn as nn

# input_size=10, hidden_size=20, num_layers=2
rnn = nn.RNN(10, 20, 2)
input = torch.randn(5, 3, 10)  # (seq_len, batch, input_size)
h0 = torch.randn(2, 3, 20)     # (num_layers * num_directions, batch, hidden_size)
output, hn = rnn(input, h0)
```
Time series representation
- (seq_len, batch, input_size)
- (batch, seq_len, input_size)  # batch first
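The two layouts differ only by swapping the first two dimensions, so one can convert between them with `transpose`; a minimal sketch with illustrative sizes:

```python
import torch

x_seq_first = torch.randn(5, 3, 10)          # (seq_len, batch, input_size)
x_batch_first = x_seq_first.transpose(0, 1)  # (batch, seq_len, input_size)
```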